NUCLEIC ACID LABELING METHODS



This application claims the benefit of U.S. provisional application 60/395,580, 
filed July 12, 2002, the disclosures of which are incorporated here by reference in their
entirety for all purposes. 

FIELD OF THE INVENTION 
This invention relates generally to the analysis of nucleic acids using a nucleic 
acid microarray, and in particular, to the labeling of ribonucleic acids and hybridization
of labeled ribonucleic acids to the nucleic acid probes on a nucleic acid microarray. This 
invention also relates to nucleic acid labeling compounds for labeling RNA. 



BACKGROUND OF THE INVENTION 
Gene expression in diseased and healthy individuals is oftentimes different and
characterizable. The ability to monitor gene expression in such cases provides medical
professionals with a powerful diagnostic tool.

One can indirectly monitor gene expression, for example, by measuring a nucleic
acid (e.g., mRNA) that is the transcription product of a targeted gene. The nucleic acid is
chemically or biochemically labeled with a detectable moiety and allowed to hybridize
with a localized nucleic acid probe of known sequence. The detection of a labeled
nucleic acid at the probe position indicates that the targeted gene has been expressed.

The labeling of a nucleic acid is typically performed by covalently attaching a
detectable group (label) to either an internal or terminal position. Scientists have reported
a number of detectable nucleotide analogues that have been enzymatically incorporated
into an oligonucleotide or polynucleotide. Langer et al., for example, disclosed
nucleotide analogues that contain a covalently bound biotin moiety. Proc. Natl. Acad.
Sci. USA 1981, 78, 6633-6637. Lockhart et al. also disclosed a method of end-labeling a
nucleic acid using a terminal transferase or an RNA ligase. See U.S. Patent No.
6,344,316, which is hereby incorporated by reference in its entirety for all purposes.

SUMMARY OF THE INVENTION
In one aspect of the invention, a method is provided for end-labeling RNA (total 
RNA, mRNA, cRNA or fragmented RNA). In one embodiment, T4 RNA ligase is used 
to attach a 3'-biotinylated AMP or CMP donor to an RNA acceptor molecule. In another 
embodiment, a pyrophosphate of the form 3 5 -AppN-3'-linker-biotin is used as donor
molecule to be ligated to an RNA acceptor molecule. 

In another aspect of the invention, a method is provided for analyzing a nucleic 
acid population on a nucleic acid microarray comprising providing a nucleic acid 
population or converting the nucleic acid population into nucleic acid fragments; ligating 
the nucleic acid population or fragments to a labeled nucleic acid molecule to form
labeled nucleic acid population or fragments using a ligase; hybridizing the labeled 
nucleic acid population or fragments to an array of nucleic acid probes, and determining 
hybridization signals of the probes as an indication of levels of the nucleic acids in the 
nucleic acid population. 

BRIEF DESCRIPTION OF THE DRAWING 
Figure 1: Comparison of replicate end-labeled (Average Ligation) vs. internally-labeled
cRNA (Average Standard) based on four replicates of each. End-labeling by ligation
results in a greater number of present calls and higher target intensity (as measured by the
average average difference) compared to internally-labeled cRNA.



DETAILED DESCRIPTION OF THE INVENTION 
The present invention has many preferred embodiments and relies on many
patents, applications and other references for details known to those of skill in the art.
Therefore, when a patent, application, or other reference is cited or repeated below, it
should be understood that it is incorporated by reference in its entirety for all purposes as
well as for the proposition that is recited.

As used in this application, the singular forms "a," "an," and "the" include plural
references unless the context clearly dictates otherwise. For example, the term "an agent"
includes a plurality of agents, including mixtures thereof.

An individual is not limited to a human being but may also be other organisms 
including but not limited to mammals, plants, bacteria, or cells derived from any of the
above. 

Throughout this disclosure, various aspects of this invention can be presented in a 
range format. It should be understood that the description in range format is merely for 
convenience and brevity and should not be construed as an inflexible limitation on the
scope of the invention. Accordingly, the description of a range should be considered to 
have specifically disclosed all the possible subranges as well as individual numerical 
values within that range. For example, description of a range such as from 1 to 6 should 
be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, 
from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6, etc., as well as individual numbers
within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth
of the range. 

The practice of the present invention may employ, unless otherwise indicated,
conventional techniques and descriptions of organic chemistry, polymer technology,
molecular biology (including recombinant techniques), cell biology, biochemistry, and
immunology, which are within the skill of the art. Such conventional techniques include
polymer array synthesis, hybridization, ligation, and detection of hybridization using a
label. Specific illustrations of suitable techniques can be had by reference to the example
hereinbelow. However, other equivalent conventional procedures can, of course, also be
used. Such conventional techniques and descriptions can be found in standard laboratory
manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using
Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A
Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring
Harbor Laboratory Press), Stryer, Biochemistry, (WH Freeman), Gait, "Oligonucleotide
Synthesis: A Practical Approach" 1984, IRL Press, London, all of which are herein
incorporated in their entirety by reference for all purposes.

The present invention can employ solid substrates, including arrays in some
preferred embodiments. Methods and techniques applicable to polymer (including
protein) array synthesis have been described in U.S.S.N. 09/536,841, WO 00/58516, U.S.
Patents Nos. 5,143,854, 5,242,974, 5,252,743, 5,324,633, 5,384,261, 5,424,186,
5,451,683, 5,482,867, 5,491,074, 5,527,681, 5,550,215, 5,571,639, 5,578,832, 5,593,839,
5,599,695, 5,624,711, 5,631,734, 5,795,716, 5,831,070, 5,837,832, 5,856,101, 5,858,659,
5,936,324, 5,968,740, 5,974,164, 5,981,185, 5,981,956, 6,025,601, 6,033,860, 6,040,193,
6,090,555, and 6,136,269, in PCT Applications Nos. PCT/US99/00730 (International
Publication Number WO 99/36760) and PCT/US01/04285, and in U.S. Patent
Applications Serial Nos. 09/501,099 and 09/122,216, which are all incorporated herein by
reference in their entirety for all purposes. Preferred arrays are commercially available
from Affymetrix, Inc. (Santa Clara, CA). See www.affymetrix.com.

Patents that describe synthesis techniques in specific embodiments include U.S.
Patents Nos. 5,412,087, 6,147,205, 6,262,216, 6,310,189, 5,889,165, and 5,959,098.
Nucleic acid arrays are described in many of the above patents, but the same techniques
are applied to polypeptide arrays.

The present invention also contemplates many uses for polymers attached to solid
substrates. These uses include gene expression monitoring, profiling, library screening,
genotyping, and diagnostics. Gene expression monitoring and profiling methods are
shown in U.S. Patents Nos. 5,800,992, 6,013,449, 6,020,135, 6,033,860, 6,040,138,
6,177,248 and 6,309,822. Genotyping and uses therefor are shown in USSN 10/013,598,
and U.S. Patents Nos. 5,856,092, 6,300,063, 5,858,659, 6,284,460 and 6,333,179. Other
uses are embodied in U.S. Patents Nos. 5,871,928, 5,902,723, 6,045,996, 5,541,061, and
6,197,506.

The present invention also contemplates sample preparation methods in certain
preferred embodiments. For example, see the patents in the gene expression, profiling,
genotyping and other use patents above, as well as USSN 09/854,317, Wu and Wallace,
Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988), Burg, U.S. Patent
Nos. 5,437,990, 5,215,899, 5,466,586, 4,357,421, Gubler et al., 1985, Biochimica et
Biophysica Acta, Displacement Synthesis of Globin Complementary DNA: Evidence for
Sequence Amplification, transcription amplification, Kwoh et al., Proc. Natl. Acad. Sci.
USA 86, 1173 (1989), Guatelli et al., Proc. Natl. Acad. Sci. USA, 87, 1874 (1990), WO
88/10315, WO 90/06995, and 6,361,947.

The present invention also contemplates detection of hybridization between
ligands in certain preferred embodiments. See U.S. Pat. Nos. 5,143,854; 5,578,832;
5,631,734; 5,834,758; 5,936,324; 5,981,956; 6,025,601; 6,141,096; 6,185,030; 6,201,639;
6,218,803; and 6,225,625 and in PCT Application PCT/US99/06097 (published as
WO 99/47964), each of which also is hereby incorporated by reference in its entirety for
all purposes.

The present invention may also make use of various computer program products
and software for a variety of purposes, such as probe design, management of data,
analysis, and instrument operation. See U.S. Pat. Nos. 5,593,839, 5,795,716, 5,733,729,
5,974,164, 6,066,454, 6,090,555, 6,185,561, 6,188,783, 6,223,127, 6,229,911 and
6,308,170.

Additionally, the present invention may have preferred embodiments that include 
methods for providing genetic information over the internet. See provisional application
60/349,546. 
Definitions 

An array of oligonucleotides or polynucleotides as used herein refers to a
multiplicity of different (sequence) oligonucleotides or polynucleotides attached
(preferably through a single terminal covalent bond) to one or more solid supports where,
when there is a multiplicity of supports, each support bears a multiplicity of
oligonucleotides or polynucleotides. The term "array" can refer to the entire collection of
oligonucleotides or polynucleotides on the support(s) or to a subset thereof. The term
"same array" when used to refer to two or more arrays is used to mean arrays that have
substantially the same oligonucleotide species thereon in substantially the same
abundances. The spatial distribution of the oligonucleotide or polynucleotide species
may differ between the two arrays, but, in a preferred embodiment, it is substantially the
same. It is recognized that even where two arrays are designed and synthesized to be
identical there are variations in the abundance, composition, and distribution of
oligonucleotide or polynucleotide probes. These variations are preferably insubstantial
and/or compensated for by the use of controls as described herein. The terms
oligonucleotide and polynucleotide can be used interchangeably in this application and
the use of one term should not appear as a limitation of the invention.

The terms "nucleic acid" or "nucleic acid molecule" refer to a
deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form,
and unless otherwise limited, would encompass known analogs of natural nucleotides that
can function in a similar manner as naturally occurring nucleotides.

An oligonucleotide or polynucleotide is a single-stranded nucleic acid ranging in 
length from 2 to about 1000 nucleotides, more typically from 2 to about 500 nucleotides 
in length. 

As used herein, a "probe" is defined as an oligonucleotide or polynucleotide
capable of binding to a target nucleic acid of complementary sequence through one or
more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, an oligonucleotide or polynucleotide
probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine,
inosine, etc.). In addition, the bases in an oligonucleotide or polynucleotide probe may be
joined by a linkage other than a phosphodiester bond, so long as it does not interfere with
hybridization. Thus, oligonucleotide or polynucleotide probes may be peptide nucleic
acids in which the constituent bases are joined by peptide bonds rather than
phosphodiester linkages. Oligonucleotide or polynucleotide probes may also be
generically referred to as nucleic acid probes.

The term "target nucleic acid" refers to a nucleic acid (often derived from a
biological sample and hence referred to also as a sample nucleic acid), to which the
oligonucleotide or polynucleotide probe specifically hybridizes. It is recognized that the
target nucleic acids can be derived from essentially any source of nucleic acids (e.g.,
including, but not limited to chemical syntheses, amplification reactions, forensic
samples, etc.). It is either the presence or absence of one or more target nucleic acids that
is to be detected, or the amount of one or more target nucleic acids that is to be
quantified. The target nucleic acid(s) that are detected preferentially have nucleotide
sequences that are complementary to the nucleic acid sequences of the corresponding
probe(s) to which they specifically bind (hybridize). The term target nucleic acid may
refer to the specific subsequence of a larger nucleic acid to which the probe specifically
hybridizes, or to the overall sequence (e.g., gene or mRNA) whose abundance
(concentration) and/or expression level it is desired to detect. The difference in usage
will be apparent from context.

The phrase "coupled to a support" means bound directly or indirectly thereto
including attachment by covalent binding, hydrogen bonding, ionic interaction,
hydrophobic interaction, or otherwise.

"Transcribing a nucleic acid" means the formation of a ribonucleic acid from a
deoxyribonucleic acid and the converse (the formation of a deoxyribonucleic acid from a
ribonucleic acid). A nucleic acid can be transcribed by DNA-dependent RNA
polymerase, reverse transcriptase, or otherwise.

A labeled moiety means a moiety capable of being detected by the various 
methods discussed herein or known in the art. 

"Bind(s) substantially" refers to complementary hybridization between a probe 
nucleic acid and a target nucleic acid and embraces minor mismatches that can be
accommodated by reducing the stringency of the hybridization media to achieve the
desired detection of the target oligonucleotide or polynucleotide sequence.

The phrase "hybridizing specifically to" refers to the binding, duplexing, or
hybridizing of a molecule preferentially to a particular nucleotide sequence under
stringent conditions when that sequence is present in a complex mixture (e.g., total
cellular) DNA or RNA. The term "stringent conditions" refers to conditions under which
a probe will hybridize preferentially to its target subsequence, and to a lesser extent to, or
not at all to, other sequences. Stringent conditions are sequence-dependent and will be
different in different circumstances. Longer sequences hybridize specifically at higher
temperatures. Generally, stringent conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
The Tm is the temperature (under defined ionic strength, pH, and nucleic acid
concentration) at which 50% of the probes complementary to the target sequence
hybridize to the target sequence at equilibrium. (As the target sequences are generally
present in excess, at Tm, 50% of the probes are occupied at equilibrium.) Typically,
stringent conditions will be those in which the salt concentration is at least about 0.01 to
1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least
about 30°C for short probes (e.g., 10 to 50 nucleotides). Stringent conditions may also be
achieved with the addition of destabilizing agents such as formamide.
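
By way of illustration only, the relationship between Tm and a stringent hybridization temperature can be sketched in Python. The sketch assumes the Wallace rule (2°C per A or T, 4°C per G or C), a rough textbook approximation for short oligonucleotides that does not appear in this disclosure; the function names are hypothetical. Only the offset of about 5°C below Tm is taken from the text above.

```python
def wallace_tm(seq: str) -> float:
    """Rough Tm in degrees C by the Wallace rule (an assumption, not from the text)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(seq: str, offset_c: float = 5.0) -> float:
    """Temperature about `offset_c` degrees below the estimated Tm."""
    return wallace_tm(seq) - offset_c


probe = "ACGTACGTAC"  # hypothetical 10-mer probe
print(wallace_tm(probe), stringent_temperature(probe))
```

The estimate ignores ionic strength, pH, and formamide, all of which the definition above treats as part of the conditions, so it is a starting point rather than a substitute for empirical optimization.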

The terms "background" or "background signal intensity" refer to hybridization
signals resulting from non-specific binding, or other interactions, between the labeled
target nucleic acids and components of the oligonucleotide or polynucleotide array (e.g.,
the oligonucleotide or polynucleotide probes, control probes, the array substrate, etc.).
Background signals may also be produced by intrinsic fluorescence of the array
components themselves. A single background signal can be calculated for the entire
array, or a different background signal may be calculated for each region of the array. In
a preferred embodiment, background is calculated as the average hybridization signal
intensity for the lowest 1% to 10% of the probes in the array, or region of the array. In
expression monitoring arrays (i.e., where probes are preselected to hybridize to specific
nucleic acids (genes)), a different background signal may be calculated for each target
nucleic acid. Where a different background signal is calculated for each target gene, the
background signal is calculated for the lowest 1% to 10% of the probes for each gene. Of
course, one of skill in the art will appreciate that where the probes to a particular gene
hybridize well and thus appear to be specifically binding to a target sequence, they should
not be used in a background signal calculation. Alternatively, background may be
calculated as the average hybridization signal intensity produced by hybridization to
probes that are not complementary to any sequence found in the sample (e.g., probes
directed to nucleic acids of the opposite sense or to genes not found in the sample such as
bacterial genes where the sample is of mammalian origin). Background can also be
calculated as the average signal intensity produced by regions of the array that lack any
probes at all.
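
By way of illustration only, the first background calculation described above (the average intensity of the lowest 1% to 10% of the probes) can be sketched as follows. The function name, the 5% default, and the sample intensities are hypothetical choices made for this sketch and are not taken from this disclosure.

```python
def background_from_lowest(intensities, fraction=0.05):
    """Average intensity of the lowest `fraction` of probes (1% to 10%)."""
    if not 0.01 <= fraction <= 0.10:
        raise ValueError("fraction should lie between 0.01 and 0.10")
    if not intensities:
        raise ValueError("no probe intensities supplied")
    ordered = sorted(intensities)
    # Always use at least one probe, even for very small arrays or regions.
    n = max(1, round(len(ordered) * fraction))
    return sum(ordered[:n]) / n


# Hypothetical probe intensities for one array region.
region = [float(v) for v in range(1, 101)]
print(background_from_lowest(region))        # lowest 5% of 100 probes
print(background_from_lowest(region, 0.10))  # lowest 10% of 100 probes
```

The same function can be applied to a whole array, to one region, or to the probes for one gene, matching the alternatives listed above; probes that hybridize well to their target would be excluded by the caller before the calculation.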

The term "quantifying" when used in the context of quantifying nucleic acid
abundances or concentrations (e.g., transcription levels of a gene) can refer to absolute or
to relative quantification. Absolute quantification may be accomplished by inclusion of
known concentration(s) of one or more target nucleic acids (e.g., control nucleic acids
such as BioB or known amounts of the target nucleic acids themselves) and referencing
the hybridization intensity of unknowns with the known target nucleic acids (e.g., through
generation of a standard curve). Alternatively, relative quantification can be
accomplished by comparison of hybridization signals between two or more genes, or
between two or more treatments to quantify the changes in hybridization intensity and, by
implication, transcription level.
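
By way of illustration only, both kinds of quantification can be sketched in Python. The sketch assumes a straight-line standard curve fitted by ordinary least squares to spiked-in controls and a log2 ratio for relative changes; the curve shape, the function names, and all numerical values are hypothetical and are not taken from this disclosure.

```python
import math


def fit_standard_curve(concentrations, intensities):
    """Least-squares line: intensity = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def absolute_concentration(intensity, slope, intercept):
    """Read an unknown off the standard curve (absolute quantification)."""
    return (intensity - intercept) / slope


def log2_fold_change(treated, control):
    """Relative quantification between two treatments."""
    return math.log2(treated / control)


# Hypothetical spike-in controls (concentration in pM, intensity in a.u.).
slope, intercept = fit_standard_curve([1, 2, 4, 8], [110, 210, 410, 810])
print(absolute_concentration(510, slope, intercept))
print(log2_fold_change(400.0, 100.0))
```

A real standard curve may be nonlinear near saturation, in which case a different fit would be needed; the straight line is used here only to keep the sketch short.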

Nucleic acid labeling

In one aspect of the present invention, the hybridized nucleic acids are detected by
detecting one or more labels attached to the sample nucleic acids. The labels may be
incorporated by any of a number of means well known to those of skill in the art.
However, in a preferred embodiment, the label is simultaneously incorporated during the
amplification step in the preparation of the sample nucleic acids. For example,
polymerase chain reaction (PCR) with labeled primers or labeled nucleotides will provide
a labeled amplification product. The nucleic acid (e.g., DNA) is amplified in the
presence of labeled deoxynucleotide triphosphates (dNTPs). The amplified nucleic acid
can be fragmented, exposed to an oligonucleotide array, and the extent of hybridization
determined by the amount of label now associated with the array. In a preferred
embodiment, transcription amplification, as described above, using a labeled nucleotide
(e.g., fluorescein-labeled UTP and/or CTP) incorporates a label into the transcribed
nucleic acids.

Alternatively, a label may be added directly to the original nucleic acid sample
(e.g., mRNA, polyA mRNA, cDNA, etc.) or to the amplification product after the
amplification is completed. Such labeling can result in the increased yield of
amplification products and reduce the time required for the amplification reaction.
Means of attaching labels to nucleic acids include, for example, nick translation or
end-labeling (e.g., with a labeled RNA) by kinasing of the nucleic acid and subsequent
attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label
(e.g., a fluorophore).

In many applications it is useful to directly label nucleic acid samples without 
having to go through amplification, transcription or other nucleic acid conversion step. 
This is especially true for monitoring of mRNA levels where one would like to extract 
total cytoplasmic RNA or poly A+ RNA (mRNA) from cells and hybridize this material 
without any intermediate steps. See U.S. Patent No. 6,344,316, which is hereby
incorporated by reference in its entirety for all purposes. 

End labeling can be performed using terminal transferase (TdT). End labeling 
can also be accomplished by ligating a labeled nucleotide or oligonucleotide or 
polynucleotide or analog thereof to the end of a target nucleic acid or probe. See U.S.
Patent No. 6,344,316.

According to one aspect of the present invention, methods of end labeling a
nucleic acid and reagents useful therefor are described. In one preferred embodiment of
the present invention, the method involves providing a nucleic acid, providing a labeled
nucleotide or oligonucleotide or polynucleotide and enzymatically ligating the nucleotide
or oligonucleotide or polynucleotide to the nucleic acid. Thus, according to one aspect of
the present invention, where the nucleic acid is an RNA, a labeled ribonucleotide can be
ligated to the RNA using an RNA ligase. RNA ligase catalyzes the covalent joining of
single-stranded RNA (or DNA, but the reaction with RNA is more efficient) with a 5'
phosphate group to the 3'-OH end of another piece of RNA (or DNA). The specific
requirements for the use of this enzyme are described in The Enzymes, Volume XV, Part
B, T4 RNA Ligase, Uhlenbeck and Gumport, pages 31-58; and 5.66-5.69 in Sambrook
et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Cold Spring
Harbor, New York (1982), all of which are incorporated here by reference in full.

According to one aspect of the present invention, a method is provided for adding
a label to a nucleic acid (e.g., extracted RNA) directly rather than incorporating labeled
nucleotides in a nucleic acid polymerization step. According to one aspect of the present
invention this may be accomplished by adding a labeled ribonucleotide or short labeled
oligoribonucleotide to the ends of a single stranded nucleic acid.

RNA can be randomly fragmented with heat in the presence of Mg2+. This
generally produces RNA fragments with 5' OH groups and phosphorylated 3' ends.
According to one aspect of the present invention, alkaline phosphatase is used to remove
the phosphate group from the 3' ends of the RNA fragment. In accordance with one
aspect of the present invention, a donor comprising a ribonucleotide having a detectable
label and having a 5'-terminal phosphate is then ligated to the 3' OH group of the RNA
fragments using T4 RNA ligase to provide a labeled RNA. The donor is also called, in
accordance with the present invention, a nucleic acid labeling compound.
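
By way of illustration only, the order of the three steps above (fragmentation, dephosphorylation, ligation) can be sketched as a small Python model that tracks the end groups of one fragment. The class, the function names, and the string codes for the end groups are hypothetical bookkeeping invented for this sketch; they model which end chemistry each step requires and are not a simulation of the enzymes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RnaFragment:
    """End-group state of one RNA fragment (hypothetical model)."""
    five_prime: str           # "OH" or "P"
    three_prime: str          # "OH", "P", or "label"
    labeled: bool = False


def fragment_with_heat_and_mg() -> RnaFragment:
    # Heat/Mg2+ fragmentation: 5' OH group and phosphorylated 3' end.
    return RnaFragment(five_prime="OH", three_prime="P")


def dephosphorylate(frag: RnaFragment) -> RnaFragment:
    # Alkaline phosphatase step: the 3' phosphate becomes a 3' OH.
    return replace(frag, three_prime="OH")


def ligate_label(frag: RnaFragment, donor_has_5p: bool = True) -> RnaFragment:
    # Ligation needs a 3' OH acceptor and a labeled donor with a 5' phosphate.
    if frag.three_prime != "OH":
        raise ValueError("acceptor needs a free 3' OH; dephosphorylate first")
    if not donor_has_5p:
        raise ValueError("donor needs a 5'-terminal phosphate")
    return replace(frag, three_prime="label", labeled=True)


fragment = fragment_with_heat_and_mg()
print(ligate_label(dephosphorylate(fragment)))
```

Calling `ligate_label` directly on a freshly fragmented molecule raises an error in this model, which mirrors why the phosphatase step precedes ligation in the text.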

T4 RNA ligase catalyzes ligation of a 5' phosphoryl-terminated nucleic acid
donor to a 3' hydroxyl-terminated nucleic acid acceptor through the formation of a 3' to
5' phosphodiester bond, with hydrolysis of ATP to AMP and PPi. Although the minimal
acceptor must be a trinucleoside diphosphate, dinucleoside pyrophosphates (NppN) and
mononucleoside 3',5'-diphosphates (pNp) are effective donors in the intermolecular
reaction. See Hoffmann and McLaughlin, Nuc. Acid. Res. 15, 5289-5303 (1987), which
is hereby incorporated by reference in its entirety for all purposes.

Detectable labels suitable for use in the present invention include any composition
detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical,
optical or chemical means. Useful labels in the present invention include biotin for
staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™),
fluorescent dyes (e.g., fluorescein, texas red, rhodamine, green fluorescent protein, and
the like, see, e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., 3H, 125I,
35S, 14C, or 32P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and
others commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g.,
gold particles in the 40-80 nm diameter size range scatter green light with high
efficiency) or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads.
Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752;
3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

A fluorescent label is preferred because it provides a very strong signal with low
background. It is also optically detectable at high resolution and sensitivity through a
quick scanning procedure. The nucleic acid samples can all be labeled with a single
label, for example, a single fluorescent label. Alternatively, in another embodiment,
different nucleic acid samples can be simultaneously hybridized where each nucleic acid
sample has a different label. For instance, one target could have a green fluorescent label
and a second target could have a red fluorescent label. The scanning step will distinguish
sites of binding of the red label from those binding the green fluorescent label. Each
nucleic acid sample (target nucleic acid) can be analyzed independently from one
another.
Hybridization 

Nucleic acid hybridization simply involves providing a denatured probe and target
nucleic acid under conditions where the probe and its complementary target can form
stable hybrid duplexes through complementary base pairing. The nucleic acids that do
not form hybrid duplexes are then washed away leaving the hybridized nucleic acids to
be detected, typically through detection of an attached detectable label. It is generally
recognized that nucleic acids are denatured by increasing the temperature or decreasing
the salt concentration of the buffer containing the nucleic acids, or by the addition of
chemical agents, or by raising the pH. Under low stringency conditions (e.g., low
temperature and/or high salt and/or high target concentration) hybrid duplexes (e.g.,
DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are
not perfectly complementary. Thus specificity of hybridization is reduced at lower
stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt)
successful hybridization requires fewer mismatches.

One of skill in the art will appreciate that hybridization conditions may be
selected to provide any degree of stringency. In a preferred embodiment, hybridization is
performed at low stringency, in this case in 6X SSPE-T at about 40°C to about 50°C
(0.005% Triton X-100), to ensure hybridization, and then subsequent washes are
performed at higher stringency (e.g., 1 X SSPE-T at 37°C) to eliminate mismatched
hybrid duplexes. Successive washes may be performed at increasingly higher stringency
(e.g., down to as low as 0.25 X SSPE-T at 37°C to 50°C) until a desired level of
hybridization specificity is obtained. Stringency can also be increased by addition of
agents such as formamide. Hybridization specificity may be evaluated by comparison of
hybridization to the test probes with hybridization to the various controls that can be
present (e.g., expression level control, normalization control, mismatch controls, etc.).

In general, there is a tradeoff between hybridization specificity (stringency) and
signal intensity. Thus, in a preferred embodiment, the wash is performed at the highest
stringency that produces consistent results and that provides a signal intensity greater
than approximately 10% of the background intensity. Thus, in a preferred embodiment,
the hybridized array may be washed at successively higher stringency solutions and read
between each wash. Analysis of the data sets thus produced will reveal a wash stringency
above which the hybridization pattern is not appreciably altered and which provides
adequate signal for the particular oligonucleotide or polynucleotide probes of interest.
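
By way of illustration only, the selection of a wash stringency from successive reads can be sketched as follows. The function name and the readings are hypothetical; the sketch applies only the signal criterion stated above (signal greater than approximately 10% of the background intensity) and leaves the judgment of "consistent results" to the practitioner.

```python
def pick_wash(readings, min_ratio=0.10):
    """Return the label of the most stringent wash that keeps adequate signal.

    `readings` is a list of (label, signal, background) tuples ordered from
    the lowest to the highest wash stringency.
    """
    chosen = None
    for label, signal, background in readings:
        if signal > min_ratio * background:
            chosen = label  # later entries are more stringent
    return chosen


# Hypothetical reads taken between successive washes.
reads = [
    ("1X SSPE-T", 500.0, 100.0),
    ("0.5X SSPE-T", 200.0, 90.0),
    ("0.25X SSPE-T", 5.0, 80.0),
]
print(pick_wash(reads))
```

If no wash meets the criterion the function returns `None`, which would indicate that even the mildest wash removed too much signal.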

In a preferred embodiment, background signal is reduced by the use of a detergent
(e.g., C-TAB) or a blocking reagent (e.g., sperm DNA, cot-1 DNA, etc.) during the
hybridization to reduce non-specific binding. In a particularly preferred embodiment, the
hybridization is performed in the presence of about 0.1 to about 0.5 mg/ml DNA (e.g.,
herring sperm DNA). The use of blocking agents in hybridization is well known to those
of skill in the art (see, e.g., Chapter 8 in P. Tijssen, supra.)

The stability of duplexes formed between RNAs or DNAs is generally in the
order of RNA:RNA > RNA:DNA > DNA:DNA, in solution. Long probes have better
duplex stability with a target, but poorer mismatch discrimination than shorter probes
(mismatch discrimination refers to the measured hybridization signal ratio between a
perfect match probe and a single base mismatch probe). Shorter probes (e.g., 8-mers)
discriminate mismatches very well, but the overall duplex stability is low.
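
By way of illustration only, mismatch discrimination as defined above (the signal ratio between a perfect match probe and a single base mismatch probe) can be computed as follows. The function name and the intensity values are hypothetical and were chosen only to show the tradeoff described in the text.

```python
def mismatch_discrimination(pm_signal: float, mm_signal: float) -> float:
    """Ratio of perfect-match to single-base-mismatch hybridization signal."""
    if mm_signal <= 0:
        raise ValueError("mismatch signal must be positive")
    return pm_signal / mm_signal


# Hypothetical signals: a long probe binds strongly but discriminates poorly,
# while a short probe gives less signal but a larger ratio.
print(mismatch_discrimination(1000.0, 800.0))  # long probe
print(mismatch_discrimination(200.0, 20.0))    # short probe
```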

Altered duplex stability conferred by using oligonucleotide or polynucleotide
analogue probes can be ascertained by following, e.g., fluorescence signal intensity of
oligonucleotide or polynucleotide analogue arrays hybridized with a target
oligonucleotide or polynucleotide over time. The data allow optimization of specific
hybridization conditions at, e.g., room temperature (for simplified diagnostic applications
in the future).

Another way of verifying altered duplex stability is by following the signal
intensity generated upon hybridization with time. Previous experiments using DNA
targets and DNA chips have shown that signal intensity increases with time, and that the
more stable duplexes generate higher signal intensities faster than less stable duplexes.
The signals reach a plateau or "saturate" after a certain amount of time due to all of the
binding sites becoming occupied. These data allow for optimization of hybridization, and
determination of the best conditions at a specified temperature. Methods of optimizing
hybridization conditions are well known to those of skill in the art (see, e.g., Laboratory
Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic
Acid Probes, P. Tijssen, ed. Elsevier, N.Y., (1993)).
Labeled nucleotides 

According to one aspect of the present invention, T4 RNA ligase is used to
enzymatically incorporate a nucleic acid labeling compound into an RNA or fragmented
RNA population. T4 RNA ligase catalyzes ligation of a 5' phosphoryl-terminated
nucleic acid donor to a 3' hydroxyl-terminated nucleic acid acceptor through the
formation of a 3' to 5' phosphodiester bond, with hydrolysis of ATP to AMP and PPi.
Although the minimal acceptor must be a trinucleoside diphosphate, dinucleoside
pyrophosphates (NppN) and mononucleoside 3',5'-diphosphates (pNp) are effective
donors in the intermolecular reaction. See, for example, Richardson, R. W. and Gumport,
R.I. (1983), Nuc. Acid Res. 11, 6167-6185 and England, T.E., Bruce, A.G., and
Uhlenbeck, O.C. (1980), Meth. Enzymol. 65, 65-74, which are hereby incorporated by
reference in their entirety for all purposes.

According to one aspect of the present invention, a method is disclosed for
end-labeling fragmented RNA (total RNA, mRNA or cRNA) prior to hybridization to a DNA
microarray. The system uses T4 RNA ligase to attach a 3'-biotinylated AMP (or CMP)
donor to the 3'-end of an RNA acceptor molecule. T4 RNA ligase catalyzes the
formation of an internucleotide phosphodiester bond between an oligonucleotide or
polynucleotide donor molecule with a 5'-terminal phosphate and an oligonucleotide or
polynucleotide acceptor molecule with a 3'-terminal hydroxyl. Although the minimal
acceptor must be a trinucleoside diphosphate, dinucleoside pyrophosphates (NppN) and
mononucleoside 3',5'-bisphosphates (pNp) are effective donors in the intermolecular
reaction.
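
The end requirements stated above (a 3'-terminal hydroxyl on the acceptor, a 5'-terminal phosphate on the donor) can be written as a small compatibility check. This is a toy model for illustration only: the class, the field values, and the example molecules are hypothetical, and the check ignores the minimal-length requirement for acceptors and all other aspects of the enzymology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleicAcid:
    """Toy description of a molecule by its two termini only."""
    name: str
    five_prime: str   # e.g. "phosphate" or "hydroxyl"
    three_prime: str  # e.g. "hydroxyl", "phosphate" or "linker-biotin"


def can_ligate(acceptor: NucleicAcid, donor: NucleicAcid) -> bool:
    """Apply the end rule from the text: the acceptor needs a 3' hydroxyl
    and the donor needs a 5' phosphate."""
    return acceptor.three_prime == "hydroxyl" and donor.five_prime == "phosphate"


# Hypothetical example molecules.
donor = NucleicAcid("3'-biotinylated CMP", "phosphate", "linker-biotin")
ready_fragment = NucleicAcid("dephosphorylated RNA fragment", "hydroxyl", "hydroxyl")
blocked_fragment = NucleicAcid("RNA fragment with 3' phosphate", "hydroxyl", "phosphate")

print(can_ligate(ready_fragment, donor))    # True
print(can_ligate(blocked_fragment, donor))  # False
```

In this toy model a fragment whose 3' end carries a phosphate is rejected as an acceptor, and a molecule whose 3' end is occupied by the linker-biotin group cannot itself serve as an acceptor.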

This technique can be used to label an RNA target and uses commonly available
labeling moieties and enzymes. cRNA can be produced using current GeneChip® Array
(Affymetrix, Inc., Santa Clara, CA) expression protocols (except in vitro transcription is
performed with standard nucleotides) followed by dephosphorylation and ligation to an
appropriate nucleic acid labeling compound as disclosed with respect to the present
invention.

In accordance with one aspect of the present invention, the nucleic acid labeling
compound, also called a donor, is herein exemplified without limitation by a labeled
nucleic acid molecule, e.g., 5'-pA-3'-linker-biotin or 5'-pC-3'-linker-biotin, or
3'-AppC-3'-linker-biotin, as shown in the following examples.

According to one aspect of the present invention, a nucleic acid labeling
compound (also sometimes called a donor) which may be used in accordance with the
present invention has the following structure:

wherein B is a heterocyclic moiety; X is a functional group which permits attachment of
the nucleic acid labeling compound to the 3' OH group of an RNA; Y is -H, -OH, -OR,
-SR, -NHR, or a halogen, preferably -F, wherein R is an alkyl or aryl group; L is a linker
and/or spacer group; and Sig is a detectable label. Preferably, X is selected from the
group consisting of HO-, PO4-, P2O7-, and P3O10- having appropriate counter ions such as
H+, Li+, Na+, NH4+ or K+. X is also preferably a nucleoside diphosphate such as App or
Cpp.



According to one aspect of the present invention, Y is preferably -OH, and the
labeling compound is preferably a ribonucleotide. In another preferred embodiment of
the present invention, Y is preferably RO-, RS-, RNH-, or F-, wherein R is an alkyl group,
preferably methyl. In another preferred embodiment, X is PO4-. In a particularly
preferred embodiment, L is -CH2-CH(OH)-CH2-(O-CH2-CH2)3-CH2-CH2-NH-. In yet
another preferred embodiment, Sig is biotin. In preferred embodiments, Sig may have
multiple biotin groups which may act to boost or enhance the ability of the Sig group to
be detected. In a still further preferred embodiment, B is a nucleotide or deoxynucleotide
base, a nucleoside or deoxynucleoside base, or natural or unnatural analogues thereof.
Preferably, B is selected from the group consisting of the natural bases A, C, G and U.
Most preferably, B is selected from the group consisting of A and C.

According to one aspect of the present invention, a nucleic acid labeling
compound is preferably the following molecule:

[Structure not reproduced. Legible fragments: a phosphate group (O=P, O-, OH) and the
chain -OCH2CH2(OCH2CH2)2CH2CH2-NH-biotin.]



Another preferred nucleic acid labeling reagent according to the present invention
has the following structure:

[Structure not reproduced. Only a phosphate fragment (O-P-O) of the drawing is legible.]



Yet another preferred nucleic acid labeling reagent according to the present
invention has the structure:

[Structure not reproduced. Legible fragment: the chain
-OCH2CH2(OCH2CH2)2CH2CH2-NH-biotin.]

wherein A is the nucleoside base adenine and C is the nucleoside base cytosine.

In yet another aspect of the present invention, the nucleic acid labeling reagent
has a Sig group incorporating multiple biotin groups such as shown
below:

[Structure not reproduced. Legible portion: a nucleotide whose phosphate carries the chain
-(Peg)-(TegB)-(Peg)-(TegB)-(Peg)-(TegB)-(Peg)-(TegB)-(Peg)-(TegB).]

where Peg is the unit derived from the reaction of compound 1 and TegB is the unit
derived from the reaction of compound 2 in standard, solid-phase DNA synthesis
chemistry:

[Structures of compounds 1 and 2 not reproduced. Legible portion: a DMTO-protected
polyether phosphoramidite ending in P-N(iPr)2.]
In yet another aspect of the present invention, the nucleic acid labeling reagent
may have fewer biotin groups such as two as shown below:

[Structure not reproduced. The legible labels show two tegB units.]



In yet another aspect of the present invention, the nucleic acid labeling reagent
may have three biotins as shown below:

[Structure not reproduced. The legible labels show three tegB units joined through peg
units.]



The invention will be further understood by the following non-limiting examples. 

Example 1 

Procedure for the synthesis of 5'-pCp-3'-linker-biotin and 5'-pAp-3'-linker-biotin

These compounds were made using commercially available reagents by the solid
phase phosphoramidite chemistry approach. See, e.g., U.S. Pat. No. 4,415,732; McBride,
L. and Caruthers, M., Tetrahedron Letters, 24:245-248 (1983); and Sinha, N. et al., Nuc.
Acids Res., 12:4539-4557 (1984), all of which are hereby incorporated by reference.
The 3'-biotinylated linker derives from commercially available BiotinTEG solid support
(Glen Research, Sterling, Virginia).

Example 2 

Procedure for the synthesis of 5'-AppC-3'-linker-biotin

This material was synthesized by the reaction of 5'-pCp-3'-linker-biotin with 10
equivalents of adenosine 5'-monophosphoromorpholidate (Sigma-Aldrich, St. Louis,
MO) in DMF at 100 degrees Celsius, followed by purification using reverse-phase and
ion-exchange chromatography as shown in the following scheme.



[Reaction scheme not reproduced. Legible content: adenosine 5'-monophosphoromorpholidate
(1) reacts with the phosphorylated linker-biotin compound (2), whose chain is
-OCH2CH2(OCH2CH2)2CH2CH2-NH-biotin, in DMF at 100°C to give AC-teg-biotin
pyrophosphate (3).]

Example 3 

Controls and preparation of sample

Unlabeled cRNA was prepared from total RNA (1 ug of human heart RNA as
starting material in these data) according to the recommended GeneChip expression
protocols (Affymetrix, Inc., Santa Clara, CA), except that unlabeled ribonucleotides were
used for in vitro transcription. In a typical reaction, ten micrograms of the cRNA was
fragmented in the standard fragmentation buffer (40 mM Tris-acetate, 30 mM magnesium
acetate, 100 mM potassium acetate) and dephosphorylated with Shrimp Alkaline
Phosphatase (Amersham Biosciences, Piscataway, NJ) at a final concentration of 0.01
U/ul. The Shrimp Alkaline Phosphatase was then heat inactivated at 65°C for 15
minutes, and the reactions were purified by ethanol precipitation. The fragmented cRNA
was placed into a ligation reaction containing 100 uM 3'-biotin-CMP with 2 U/ul T4 RNA
Ligase (New England Biolabs, Beverly, MA) and 16% PEG in the recommended buffer
for 2 hours at 37°C. The ligation reaction was then added to a hybridization cocktail
containing 0.5 mg/ml Acetylated BSA (Invitrogen Life Technologies, Carlsbad, CA), 0.1
mg/ml Herring Sperm DNA (Promega, Madison, WI), 50 pM Oligo B2 (Affymetrix, Inc.,
Santa Clara, CA) and 1X Eukaryotic Hybridization Controls (Affymetrix, Inc.), making
up a total volume of 220 ul. 200 ul of labeled cRNA target was hybridized to Affymetrix
HuU95Av2 arrays for 16 hours at 45°C. Standard wash and stain protocols were used as
recommended in the GeneChip Expression Analysis technical manual. Analyses were
carried out using Affymetrix Microarray Suite Version 4.0.
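
The final concentrations quoted for the ligation reaction (100 uM donor, 2 U/ul T4 RNA Ligase, 16% PEG) translate into pipetting volumes by the usual dilution relation C1 x V1 = C2 x V2. The sketch below is illustrative only: the stock concentrations and the 33 ul reaction volume used in the example call are assumptions, not values prescribed for this reaction.

```python
def stock_volume_ul(final_conc: float, stock_conc: float,
                    reaction_volume_ul: float) -> float:
    """Volume of stock needed so that the reaction reaches final_conc.

    final_conc and stock_conc must use the same unit (uM, U/ul, or percent).
    """
    if stock_conc <= 0 or reaction_volume_ul <= 0:
        raise ValueError("stock concentration and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("stock is too dilute for the requested concentration")
    return final_conc * reaction_volume_ul / stock_conc


REACTION_UL = 33.0  # assumed reaction volume for this illustration

# (final concentration, hypothetical stock concentration), same unit per row
components = {
    "3'-biotin-CMP (uM)": (100.0, 1000.0),
    "T4 RNA Ligase (U/ul)": (2.0, 20.0),
    "PEG (%)": (16.0, 50.0),
}

volumes = {name: stock_volume_ul(final, stock, REACTION_UL)
           for name, (final, stock) in components.items()}
remaining_ul = REACTION_UL - sum(volumes.values())  # left for buffer, RNA, water

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} ul")
print(f"remaining: {remaining_ul:.2f} ul")
```

With these assumed stocks, the viscous PEG stock accounts for roughly a third of the reaction volume, which limits how much volume remains for the RNA sample and buffer.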

Figure 1 shows the percent present calls and average-average difference of
end-labeled RNA and internally-labeled RNA. The average-average difference is the
intensity of the perfect match probe minus the intensity of the mismatch probe, averaged
over all probe sets on the microarray, and is a measure of the overall signal intensity. The
percent present call is an output of the Microarray Suite (Affymetrix, Inc., Santa Clara,
CA) software based on gene probe set intensities. Both are considered metrics for
labeling efficiency and RNA integrity. A greater number of genes are called present
using the ligation method than using internally-labeled RNA. Furthermore, the
fluorescent signal (as measured by the average-average difference) is higher for the
ligation method.
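
The average-average difference described above can be sketched as a two-level mean: the perfect match minus mismatch difference is averaged within each probe set, and those per-set values are then averaged across the array. This is a minimal illustration of that definition, not the Microarray Suite implementation; the function names and the intensities are hypothetical.

```python
from statistics import mean


def average_difference(probe_pairs):
    """Mean of (perfect match - mismatch) over the probe pairs of one probe set."""
    return mean(pm - mm for pm, mm in probe_pairs)


def average_average_difference(probe_sets):
    """Mean of the per-probe-set average differences across the array."""
    return mean(average_difference(pairs) for pairs in probe_sets.values())


# Hypothetical (PM, MM) intensities for two probe sets.
example_array = {
    "probe_set_a": [(500.0, 100.0), (300.0, 100.0)],  # average difference 300
    "probe_set_b": [(120.0, 100.0), (80.0, 100.0)],   # average difference 0
}

print(average_average_difference(example_array))  # 150.0
```

A probe set whose mismatch probes are as bright as its perfect match probes contributes nothing to the metric, which is why the value tracks specific signal rather than raw brightness.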

To test the reproducibility of the ligation labeling method, four independent
reactions were carried through starting from total human heart RNA using the
recommended GeneChip (Affymetrix, Inc., Santa Clara, CA) expression protocols,
except that unlabeled ribonucleotides were used for in vitro transcription. Forty-five ug
of the resulting cRNA were fragmented and treated with Shrimp Alkaline Phosphatase at
a final concentration of 0.01 U/ul in duplicate 51 ul reactions at 37°C for 1 hr. The
Shrimp Alkaline Phosphatase was then heat inactivated at 65°C for 15 minutes, and the
reactions were purified by ethanol precipitation. 11 ug of fragmented, dephosphorylated
cRNA were ligated to 100 uM 3'-biotin-CMP with 2 U/ul T4 RNA Ligase and 16% PEG
for 2 hours at 37°C in duplicate 33 ul reactions. Each 33 ul ligation reaction was then
added to a hybridization cocktail containing 0.5 mg/ml Acetylated BSA (Invitrogen Life
Technologies), 0.1 mg/ml Herring Sperm DNA, 50 pM Oligo B2 and 1X Eukaryotic
Hybridization Controls, making up a total volume of 220 ul. 200 ul of labeled cRNA target
were hybridized to HuU95Av2 arrays (Affymetrix, Inc., Santa Clara, CA) for 16 hours at
45°C. Standard wash and stain protocols were used as recommended in the GeneChip
Expression Analysis technical manual (Affymetrix, Inc., Santa Clara, CA). Analyses
were carried out using Microarray Suite Version 4.0 (Affymetrix, Inc., Santa Clara, CA).

A quantitative comparison of the expression data from the replicate reactions
produces a correlation coefficient (R²) of 0.98-0.99 between the replicates, underscoring
the high reproducibility of the end-labeling method. Comparing the end-labeled
replicates to internally-labeled RNA produces an R² value between 0.88 and 0.94.
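
A replicate-to-replicate R² of the kind reported above can be computed from paired expression values. The sketch below assumes that R² means the squared Pearson correlation coefficient; the text does not state how the value was calculated, and the function name and the sample values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical signal values for the same four probe sets on two arrays.
replicate_1 = [1.0, 2.0, 3.0, 4.0]
replicate_2 = [1.0, 3.0, 2.0, 4.0]
print(round(r_squared(replicate_1, replicate_2), 2))  # 0.64
```

Values near 1 indicate that two arrays rank and scale the probe sets almost identically.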

Example 4 

Table 1 summarizes nucleic acid labeling reagents of the present invention (which
are also described in greater detail above) and also provides convenient abbreviations
(RLR = RNA Labeling Reagent):

Table 1: RNA labeling reagents

Nomenclature    Compound
RLR-4a          5'-pAp-teg-biotin-3'
RLR-4b          5'-pA5p-teg-biotin-3'
RLR-5           5'-pCp-teg-biotin-3'
RLR-6           A(5')pp(5')Cp-teg-biotin-3'
RLR-7           5'-pCp-(teg-biotin)5-3'
RLR-8           5'-pCp-(teg-biotin)2-3'
RLR-9           5'-pCp-(teg-biotin)3-3'



Example 5 

Ligation efficiency of RLR-4a and RLR-4b

The goal of these experiments was to demonstrate the concept of ligation-mediated
labeling and determine the labeling efficiency of two different RNA Labeling
Reagents (RLRs): RLR-4a, pAp-biotin, and RLR-4b, pA5-biotin [10]. RLR concentration
(1 uM to 250 uM) and T4 RNA Ligase concentration (1 U/ul to 4 U/ul) were tested, as
well as ligation time (4 hr and 8 hr). One reaction without T4 RNA Ligase served as a
negative control. Another ligation reaction omitted the cRNA dephosphorylation step in
order to test the requirement for dephosphorylation. All the reactions were performed
with human heart RNA (Ambion) and were hybridized to Human U95Av2 arrays under
standard conditions (10 ug labeled cRNA hybridized for 16 hours in 1X hybridization
solution [100 mM MES, 1 M Na+, 20 mM EDTA, 0.01% Tween20] at 45°C, 60 rpm).
The arrays were washed, stained (using single stain protocol), and scanned according to
the standard Affymetrix protocols.

The following concentrations were tested for each RLR compound: 50 uM, 10
uM, and 1 uM. Ligation took place for 4 hours at 30°C with 2 U/ul T4 RNA Ligase (New
England Biolabs). One RELA sample was not treated with Shrimp Alkaline Phosphatase
and was ligated with RLR-4a at 50 uM for comparison.

Both the signal (AvgAvgDifference) and the present call rate (%P) were
improved by increasing RLR concentration and T4 RNA Ligase concentration
independently. The best performance was achieved by increasing RLR concentration in
conjunction with enzyme concentration. RLR-4a, the monomer, consistently performed
better than RLR-4b, the five-mer, at the same concentrations. No significant difference
was observed between a 4 hour incubation and an 8 hour incubation. Dephosphorylation
of the cRNA is necessary for efficient ligation, as demonstrated by the low signal and
number of present calls in the "50 uM RLR-4a un-SAP'd" sample. Background intensity
was comparable across all the arrays. At this stage, the optimum reaction conditions
were 250 uM RLR-4a, 4 U/ul T4 RNA Ligase for 4 hours at 30°C. These experiments
demonstrate the viability of end-labeling RNA for use with DNA microarrays.

Example 6 

Ligation efficiency of RLR-5

We next tested a range of RLR-5 (5'-pCp-teg-biotin-3') concentrations in the
ligation reaction. In the literature, 5'-[32P]pCp-3' is putatively the preferred donor
molecule under most radio-labeling conditions [5]. We tested the following range of
RLR-5 concentrations at 20°C: 50 uM, 100 uM, 250 uM, 500 uM and 1000 uM. In
addition, two 250 uM RLR-5 reactions were incubated at 30°C and 37°C for comparison
to the 20°C reaction temperature. The ligation reactions were carried out using human
heart RNA, 2 U/ul T4 RNA Ligase and 16% PEG for 2 hours.

All of the RLR-5 samples gave equivalent or better signals compared to the
standard. The 100 uM RLR-5 sample gave the highest signal, but this difference may be
within experimental error. There was no significant difference in signal between the
20°C, 30°C and 37°C incubation temperatures of the 250 uM RLR-5 sample. However,
the 37°C incubation of 250 uM RLR-5 gave the best overall present call rate of all the
conditions tested. RLR-5 concentrations between 50 uM and 250 uM gave equivalent or
better present call rates compared to the standard; RLR-5 concentrations greater than or
equal to 500 uM may be slightly inhibitory as demonstrated by the slightly lower present
calls, although signal intensity remained high. These experiments demonstrate that
RLR-5 slightly outperforms RLR-4a: at 50 uM RLR concentration, 16% PEG and 20°C,
RLR-5 has slightly higher signal and present calls than RLR-4a (50 uM RLR-4a, 16% PEG,
20°C: 32.0%P, 104 unscaled signal; 93 scaled signal).

The R² correlation between the standard method and RELA method ranged from
0.93 to 0.94. The R² correlation between different ligation reactions ranged from 0.97 to
0.99, which is comparable to the variance of the standard labeling method.

Example 7 

Ligation efficiency of RLR-6 

In this experiment, we tested the performance of RLR-6,
A(5')pp(5')Cp-teg-biotin-3', the adenylated donor intermediate, at different concentrations
in the ligation reaction. We also measured the kinetics of the reaction, comparing the effect
of ligation time, RLR concentration, and enzyme concentration on array performance. The
following seven reactions were carried out using human heart RNA on U95Av2 arrays:



1)   Standard (internally-labeled cRNA)
2)   50 uM RLR-6      20 min.     2.0 U/ul T4 RNA Ligase
3)   100 uM RLR-6     20 min.     2.0 U/ul T4 RNA Ligase
4)   200 uM RLR-6     20 min.     2.0 U/ul T4 RNA Ligase
5)   100 uM RLR-6     5 min.      2.0 U/ul T4 RNA Ligase
6)   100 uM RLR-6     120 min.    2.0 U/ul T4 RNA Ligase
7)   100 uM RLR-6     20 min.     0.5 U/ul T4 RNA Ligase
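
The six ligation reactions listed above form three one-variable series around reaction 3 (100 uM RLR-6, 20 min., 2.0 U/ul): one varying RLR-6 concentration, one varying ligation time, and one varying enzyme concentration. The sketch below simply makes that design explicit; the data structure and the function name are hypothetical, and the values are transcribed from the list of reactions.

```python
from typing import NamedTuple


class Ligation(NamedTuple):
    number: int
    rlr6_um: float
    minutes: float
    ligase_u_per_ul: float


# Reactions 2-7 from the list (reaction 1 is the internally-labeled standard).
REACTIONS = [
    Ligation(2, 50.0, 20.0, 2.0),
    Ligation(3, 100.0, 20.0, 2.0),
    Ligation(4, 200.0, 20.0, 2.0),
    Ligation(5, 100.0, 5.0, 2.0),
    Ligation(6, 100.0, 120.0, 2.0),
    Ligation(7, 100.0, 20.0, 0.5),
]
REFERENCE = REACTIONS[1]  # reaction 3, shared by all three series


def series(varying: str) -> list:
    """Reactions that match the reference in every field except `varying`,
    ordered by the varying field."""
    fixed = [f for f in Ligation._fields if f not in ("number", varying)]
    picked = [r for r in REACTIONS
              if all(getattr(r, f) == getattr(REFERENCE, f) for f in fixed)]
    return sorted(picked, key=lambda r: getattr(r, varying))


for field in ("rlr6_um", "minutes", "ligase_u_per_ul"):
    print(field, [r.number for r in series(field)])
```

Because each series changes a single parameter, differences in signal within a series can be attributed to that parameter alone.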



After only 20 minutes, the 100 uM and 200 uM concentrations of RLR-6 with 2
U/ul T4 RNA Ligase gave equivalent or better signal compared to the standard. Signal
increases as the reaction time is increased from 5 minutes to 20 minutes to 120 minutes in
the 100 uM RLR-6 reaction. Similarly, signal increases as RLR-6 concentration
increases from 50 uM to 100 uM to 200 uM with the 20 minute ligation time. The
highest signal was achieved with the 100 uM RLR-6, 2 U/ul T4 RNA Ligase, 120 minute
reaction; the signal correlated well with that of the standard, with an R² correlation of
0.93. The next highest signal was achieved with the 200 uM, 20 minute ligation, which
had an R² correlation of 0.94 compared to the standard.

In terms of enzyme concentration, using 0.5 U/ul T4 RNA Ligase, or one-quarter
of the normal amount, reduced the signal by half. The present call results followed the
same trend as the signal results. With the exception of the 5 minute reaction and the 0.5
U/ul Ligase reaction, all of the reactions resulted in an equivalent or better number of
present calls compared to the standard. The condition that gave the best overall result was
the 100 uM RLR-6, 2 U/ul T4 RNA Ligase, 2 hr reaction. The next best result came from
the 200 uM, 20 minute reaction, which had comparable present calls but slightly lower
signal.

We also tested the need for ATP in the ligation reaction with RLR-6. Because
RLR-6 is a pre-adenylated donor molecule, ATP should not be necessary in the ligation
reaction and could possibly be inhibitory [7]. Indeed, the above reactions were
performed without ATP, demonstrating that ATP is not necessary for efficient ligation
with RLR-6. We found that the presence of ATP does have a slight inhibitory effect.

Example 8 

Ligation Reaction Additives

In this experiment we sought to increase ligation efficiency by adding substances
known to enhance various enzymatic reactions involving nucleic acids. Reports in the
literature suggest that additives such as BSA, DMSO and PEG can improve the ligation
efficiency for some substrates [11, 12]. Starting from fragmented, dephosphorylated
human heart cRNA, we tested ligation with the following additives: 1) 10 ug/ml BSA, 2)
10% DMSO, 3) 16% PEG 8000, 4) no additive (control). The four reactions were carried
out with 2 U/ul T4 RNA Ligase (from NEB) and 50 uM RLR-4a (suboptimal ligation
conditions). A fifth ligation reaction using Promega T4 RNA Ligase without additives
was included for a vendor comparison. The ligation reactions were hybridized to
U95Av2 arrays under standard conditions.

Of the three additives, only PEG had a significant effect on ligation efficiency in
terms of array performance. The addition of 16% PEG dramatically increased overall
signal intensity and present call percentage compared to the no additive control. BSA
appeared to hinder ligation, as demonstrated by the lower signal intensity and lower
present call rate. The DMSO did not have an effect on signal or present call rate. In
terms of enzyme performance, the NEB T4 RNA Ligase was much more effective than
the Promega version, which had the lowest signal and present call rate of all the
conditions tested.

We set out to identify the optimal PEG concentration in the ligation reaction. As
with the previous optimization experiments, we tested ligation under suboptimal
conditions in order to discern subtle differences between the different conditions tested.
The ligation reactions were carried out with 2 U/ul T4 RNA Ligase and 50 uM RLR-4a at
20°C for 2 hr with the following concentrations of PEG: 0%, 10%, 16%, and 25%. We
also tested a higher concentration of RLR-4a, 179 uM, plus or minus 16% PEG. The
ligations were hybridized to U95Av2 arrays under standard conditions.

Increasing the PEG concentration in the ligation reaction increased both the signal
and the present call percentage. Within the 50 uM RLR-4a subset, the best signal was
achieved with the 25% PEG ligation reaction. In terms of present calls, the 16% PEG
and 25% PEG ligations gave equivalent results, exceeding the present call percentage of
the standard by ~2%. The addition of PEG proved beneficial even at the highest
RLR-4a concentration tested, 179 uM. The addition of 16% PEG increased the signal by
1.3-fold and the present calls by almost 6% in comparison to the "no PEG" control.
Because PEG solutions are highly viscous, we have found that a final concentration of 16%
PEG both enhances array performance and remains methodologically tractable.

Example 9 

RNA Fragmentation: testing Mg2+ hydrolysis parameters

In order to optimize array performance, we examined different fragmentation
buffers and the effect of fragment length on array performance. For the RELA method
we tested the relationship between fragment length, array intensity and detection
sensitivity.

Attorney Docket No.: 3507.1

Because the downstream ligation reaction is affected by the fragmentation buffer,
we examined buffers with lower monovalent ion concentrations and alternative cation
compositions. Labeled and unlabeled cRNA was prepared from HeLa total RNA
following standard expression protocols. Both labeled and unlabeled cRNAs were
fragmented using Mg2+ and high heat in the following buffers: a) 5X = 200 mM
Tris-acetate, pH 8.1, 150 mM MgOAc, 500 mM KOAc (Affymetrix standard); b) 5X = 200
mM Tris, 150 mM MgOAc, pH 8.2; c) 5X = 200 mM Tris, 150 mM MgCl2, pH 8.2. The
fragmented unlabeled cRNA was dephosphorylated with Shrimp Alkaline Phosphatase at
37°C for 1 hour, followed by heat-inactivation at 65°C for 15 minutes. The
dephosphorylated, fragmented cRNA was end-labeled with 100 uM RLR-6 at 37°C for 2
hours in a reaction containing 2 U/ul T4 RNA Ligase and 16% PEG. For all reactions, ten
micrograms of labeled cRNA were hybridized to U133A arrays and processed according
to the standard antibody amplification protocol.
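
The 5X recipes above convert to working (1X) concentrations by dividing by five. The following sketch performs that arithmetic for the three buffers; the dictionary layout, the function names, and the 40 ul example volume are illustrative assumptions, while the concentrations are those listed in the text.

```python
# 5X stock compositions in mM, keyed by the buffer letters used in the text.
FIVE_X_BUFFERS_MM = {
    "a": {"Tris-acetate": 200.0, "MgOAc": 150.0, "KOAc": 500.0},
    "b": {"Tris": 200.0, "MgOAc": 150.0},
    "c": {"Tris": 200.0, "MgCl2": 150.0},
}


def working_concentrations(stock_mm: dict, fold: float = 5.0) -> dict:
    """Concentrations (mM) after diluting a concentrated stock to 1X."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {component: conc / fold for component, conc in stock_mm.items()}


def stock_volume_ul(reaction_volume_ul: float, fold: float = 5.0) -> float:
    """Volume of concentrated stock needed for a 1X reaction of the given size."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return reaction_volume_ul / fold


for letter, recipe in FIVE_X_BUFFERS_MM.items():
    print(letter, working_concentrations(recipe))
print(stock_volume_ul(40.0))  # 8.0 ul of 5X buffer in a hypothetical 40 ul reaction
```

At 1X, buffer a) works out to 40 mM Tris-acetate, 30 mM magnesium acetate and 100 mM potassium acetate, which agrees with the standard fragmentation buffer given in Example 3.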

For both RELA and STD, MgOAc was preferred over MgCl2 for the highest
overall signal intensities and number of present calls. The standard cRNA fragmented
with the Affymetrix commercial buffer performed the best by far. Fragmentation of the
standard cRNA with the modified buffers significantly reduced both the number of
present calls and signal intensity. For the RELA samples, the present call rates did not
vary significantly between the different fragmentation buffers tested. However, the
RELA samples fragmented with MgOAc-containing buffers had higher signals than the
sample which was fragmented with the MgCl2 buffer.

Example 10

End-labeling with multiple biotins: RLR-7, RLR-8 and RLR-9

In accordance with one aspect of the present invention, the Sig moiety may have
multiple biotin residues. In accordance with the present invention, it has been discovered
that use of a nucleic acid labeling compound having multiple biotin residues to end label
RNA has the potential of increasing target RNA signal as well as detection sensitivity.
However, preliminary data indicate that there are limits to the number of biotin residues
which can be incorporated into a Sig moiety and usefully employed to end label RNA for
purposes of detection as described in accordance with the present invention.

With regard to possible limits to the number of biotin moieties which may usefully
be incorporated into a donor molecule, a donor molecule with five teg-biotins attached to
the 3' position of the ribose (5'-pCp-(teg-biotin)5-3'), called RLR-7, was synthesized. In
preliminary experiments, RNA labeled with RLR-7 and hybridized to a GeneChip® array
gave aberrant hybridization results. While the overall hybridization pattern of RNA
labeled with RLR-7 is somewhat similar to those of the standard and of RLR-5 (which has
one biotin), in many cases RLR-7 hybridization misses areas where signal
should be present and lights up areas which are not present in the standard. The
significance, if any, of these preliminary data with RLR-7 is unknown at the present time.

Donor molecules having fewer than five biotin moieties were prepared: RLR-8 (2
biotins), and RLR-9 (3 biotins). RLR-9 gave the highest unscaled signal intensity.
However, background intensity increases proportionately as signal increases. In the
preliminary experiments performed, RLR-9 performed well compared to the other RNA
labeling reagents being tested. Despite having the highest background, RLR-9 had the
highest overall number of present calls compared to RLR-5 and RLR-8.

It is understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will
be suggested to persons skilled in the art and are to be included within the spirit and
purview of this application and scope of the appended claims. All publications, patents,
and patent applications cited herein are hereby incorporated by reference for all
purposes.